A meta-analysis of XRCC1 single nucleotide polymorphism and susceptibility to gynecological malignancies

Abstract Background: Gynecological malignant tumors are a serious threat to women's health; cervical cancer, endometrial cancer and ovarian cancer are the most common. The eponymous protein encoded by the XRCC1 (X-ray repair cross complementation 1) gene is an important functional protein in the repair of single-strand DNA damage. Non-synonymous mutations of the XRCC1 gene cause amino acid sequence changes that affect protein function and DNA repair ability, and may affect the interaction with other DNA repair proteins, leading to an increased risk of tumor development. Many studies have assessed the association between XRCC1 gene polymorphisms and the risk of cancer in the female reproductive system, but the results have been inconclusive. In this study, the relationship between the XRCC1 Arg399Gln, Arg194Trp and Arg280His single nucleotide polymorphisms and susceptibility to gynecological malignancies was further explored by meta-analysis. Methods: English databases searched: PubMed, Medline, Excerpta Medica Database, Cochrane, etc.; Chinese databases searched: China National Knowledge Infrastructure, Wanfang Database, etc. STATA14 was used for statistical analysis, including odds ratio (OR) estimation, subgroup analysis, heterogeneity testing, sensitivity analysis, and assessment of publication bias. Results: In gynecologic cancers, the difference in Arg399Gln allele frequency between the case and control groups was statistically significant (G vs A: P = .007). There was no significant difference in allele frequency between the case and control groups for Arg194Trp or Arg280His (P = .065 and .198, respectively).
In the different gene models, Arg399Gln was significantly correlated with gynecologic cancer susceptibility (GG vs AA: OR 0.91; 95% confidence interval [CI], 0.85, 0.98); Arg194Trp was significantly correlated with gynecologic cancer susceptibility (CC vs TT: OR 0.94; 95% CI, 0.88, 1.00; CC vs CT: OR 0.97; 95% CI, 0.90, 1.05); Arg280His was significantly correlated with gynecologic cancer susceptibility (GG vs AA: OR 0.98; 95% CI, 0.94, 1.02; GG vs GA: OR 1.00; 95% CI, 0.97, 1.04). In the subgroup analysis, Arg399Gln and Arg194Trp were significantly correlated with gynecologic cancer susceptibility in Asian populations (P < .001 and P = .049, respectively). In the analysis of cancer subgroups, the association between Arg399Gln and cervical cancer susceptibility was statistically significant (P = .039), as was the association between Arg194Trp and endometrial cancer susceptibility (P = .033 and .001). Conclusions: The XRCC1 Arg399Gln, Arg194Trp and Arg280His single nucleotide polymorphisms were associated with gynecologic cancer susceptibility. The Arg399Gln genotype was significantly associated with cervical cancer susceptibility, and the Arg194Trp genotype was significantly associated with endometrial cancer susceptibility.


Introduction
Gynecological malignant tumors are major diseases that seriously threaten women's health. Cervical cancer, endometrial cancer and ovarian cancer are the most common, and surgery, radiotherapy and chemotherapy are the main treatments. Cervical cancer is the most common gynecologic tumor, with about 530,000 new cases and 260,000 deaths each year globally. With the spread of cervical cancer screening, the incidence of cervical squamous epithelial lesions and cervical squamous cell carcinoma has decreased. However, the incidence of cervical adenocarcinoma is rising, and its share of cervical cancers has increased from 5% to about 20%. The prognosis of early cervical cancer is relatively good, but problems remain in the diagnosis and treatment of cervical adenocarcinoma. [1,2] At present, in western developed countries, endometrial cancer has the highest incidence among malignant tumors of the female reproductive system, and its mortality is second only to that of ovarian cancer. [3] Although endometrial cancer generally has a good prognosis, its increasing morbidity and mortality make prevention and control increasingly urgent. [4] Ovarian cancer ranks third in incidence among female genital malignancies, but its mortality is the highest. Because the early clinical symptoms of ovarian cancer are not obvious, the onset is insidious, and there is no reliable detection method, about 60% of ovarian cancer patients have advanced tumors and extensive metastasis at first diagnosis, and the 5-year survival rate is only 40% to 45%. [5] Cancer is a multifactorial disease, and genetic factors are important determinants of susceptibility. Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation, accounting for about 90% of human genetic variation, and some loci have been shown to be related to gene phenotypes and tumor susceptibility. [6,7]
It is therefore important to find new, sensitive molecular markers for cancer. The X-ray repair cross-complementing gene 1 (XRCC1) is about 33 kb in size, is located on chromosome 19q13.2-19q13.3, and contains 17 exons. It is an important gene in the base excision repair pathway of DNA damage repair, and its protein mainly forms complexes with a variety of enzymes, including poly(ADP-ribose) polymerase, DNA polymerase beta, and DNA ligase III, that take part in the DNA repair process. [8] Current XRCC1 polymorphism studies focus mainly on 3 nonsynonymous SNPs, namely Arg194Trp (rs1799782), Arg280His (rs25489) and Arg399Gln (rs25487). The relationship between XRCC1 polymorphisms and susceptibility to malignant tumors (such as nasopharyngeal cancer, breast cancer, lung cancer, stomach cancer, liver cancer, pancreatic cancer, colorectal cancer, prostate cancer, glioma, etc.) has been reported many times. [9][10][11][12][13][14][15] Among the 3 nonsynonymous SNPs, Arg399Gln and Arg194Trp were the most strongly correlated with cervical cancer susceptibility, but the conclusions were inconsistent. Therefore, this study used meta-analysis to explore the relationship between XRCC1 Arg399Gln, Arg194Trp and Arg280His and susceptibility to 3 common female reproductive system tumors.
Many studies have assessed the association between polymorphism in the XRCC1 gene and the risk of cancer in the female reproductive system, but the results have been inconclusive. In this study, the relationship between XRCC1 Arg399Gln, Arg194Trp, Arg280His single nucleotide polymorphisms and susceptibility to gynecological malignancies was further explored by meta-analysis.

Protocol registration
The protocol was based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Protocols (PRISMA-P) statement guidelines. In the English databases, subject words and free words were combined with "OR," and keywords were connected with "AND." For single nucleotide polymorphism, the subject word was "Polymorphism" and the free words were "Genetic Polymorphism, Polymorphism (Genetics), Genetic isms." For X-ray cross complementing repair gene 1, the subject word was "XRCC1" and the free words were "X-ray cross-complementing 1 or Arg399Gln, Arg194Trp, Arg280His." For female reproductive tract malignant tumors, the subject word was "Female Reproductive system Cancer" and the free word was "Gynecological cancer." For ovarian cancer, the subject word was "Ovarian Neoplasms" and the free words were "Ovarian Cancer, epithelial," etc. For cervical cancer, the subject word was "Cervical Carcinoma" and the free word was "Cervical Cancer," etc. For endometrial cancer, the subject word was "Endometrial Cancer" and the free word was "Endometrial Carcinoma," etc. The Chinese databases were searched with "XRCC1," "gene polymorphism," "gynecological tumor," "ovarian cancer," "cervical cancer," "endometrial cancer" and "susceptibility" as keywords, with no language restriction.
2.4. Literature screening, data extraction and quality evaluation
2.4.1. Newcastle-Ottawa scale. A self-made data collection table was adopted, literature selection and data entry were completed independently by two people, and disputes were   The full score is 9; studies scoring ≥6 are of high quality, and those scoring less than 6 are of low quality. A Newcastle-Ottawa Quality Assessment Scale score ≥6 was an inclusion criterion. [16]
2.4.2. Hardy-Weinberg equilibrium. Under Hardy-Weinberg equilibrium, when genes are passed from generation to generation without evolutionary influences, the allele frequencies and genotype frequencies of the population remain unchanged. P > .05 was an inclusion criterion.
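The Hardy-Weinberg criterion can be illustrated with a short Python sketch. It applies a chi-square goodness-of-fit test with 1 degree of freedom to the three genotype counts of a biallelic SNP. The source does not state which test was used, so the chi-square test is an assumption, and the function name and example counts are hypothetical rather than taken from the included studies.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts for a biallelic SNP
    (hypothetical argument names). Returns (chi2, p_value) with 1 df.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p                      # frequency of the second allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip cells with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Illustrative counts only: the first set fits equilibrium exactly,
# the second has a clear heterozygote deficit.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(40, 20, 40))
```

Under the criterion above, a genotype distribution would pass when the returned P value exceeds .05.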

Linkage disequilibrium.
Linkage disequilibrium (LD) is also called allelic association. Under LD, the probability that a haplotype appears on the same chromosome deviates from the probability expected from random combination of its alleles; this deviation is the degree of LD and is caused by mutation. Theoretically, the size of LD is related to the distance between two sites: the smaller the distance, the lower the chance of recombination and the stronger the LD.
Under linkage equilibrium, the probability of the haplotype AB is P(AB) = P(A) × P(B). If there is LD, the observed probability P(AB) differs from this product. The difference between the two probabilities reflects the degree of LD and is the index D: D = P(AB) - P(A) × P(B). D = 0 means complete linkage equilibrium: alleles at different loci combine at random, and the frequency of each allele combination equals the product of the respective allele frequencies. r² = 0 likewise means complete linkage equilibrium, that is, random combination. r² = 1 means complete LD with no recombination, indicating that the alleles at the 2 loci have the same frequency and that the occurrence of an allele at 1 locus completely predicts the occurrence of the corresponding allele at the other locus.
When D is not 0, there is LD between the 2 loci. D is between 0 and 1, and the greater D is, the higher the degree of linkage. D > 0 means that the probability of two alleles (AB) occurring together on the same chromosome is greater than the probability expected from their random distribution in the population. The 2 sites are then said to be in LD, and there is an allelic association, which is of great significance for the study of gene correlation. For example, LD between SNP1 (G/A) and SNP2 (C/T) was observed to be associated with disease susceptibility, and haplotype AC was a disease-related risk factor. D between 0 and 1 was an inclusion criterion. [17][18]
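The LD indices described above can be sketched in Python for two biallelic loci. The function takes the haplotype frequency P(AB) and the allele frequencies P(A) and P(B), and returns D and r² as defined in the text. It also returns the normalized coefficient D′ (D divided by its largest attainable magnitude for the given allele frequencies), which is not defined in the text and is included here only as an assumption, since it is the quantity whose magnitude is bounded by 0 and 1. All names and example frequencies are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D_prime, r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    """
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Illustrative frequencies only.
print(ld_stats(0.25, 0.5, 0.5))  # random combination of alleles
print(ld_stats(0.50, 0.5, 0.5))  # allele A always travels with allele B
```

In the second call the two alleles always occur together, so both D′ and r² reach 1, matching the description of complete LD given above.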

Statistical method
Five gene models were used: homozygous gene model (GG vs AA), heterozygous gene model (GG vs GA), dominant gene model (GG vs GA + AA), recessive gene model (AA vs GG + GA), and allelic gene model (G vs A).
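As a minimal sketch of how the five gene models translate into 2 × 2 tables, the Python code below derives an OR for each model from the genotype counts of a single study. The 95% CI uses the log (Woolf) method, which is an assumption here because the source does not say how per-study intervals were obtained. The function names and the example counts are hypothetical.

```python
import math


def odds_ratio(a, b, c, d):
    """OR and 95% CI (log method) for a 2x2 table.

    a, b: cases in the first and second comparison group
    c, d: controls in the first and second comparison group
    """
    or_value = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(or_value) - 1.96 * se)
    high = math.exp(math.log(or_value) + 1.96 * se)
    return or_value, low, high


def gene_model_ors(case, control):
    """case/control: dicts with genotype counts under keys 'GG', 'GA', 'AA'."""
    def alleles(g):
        # Each homozygote carries two copies, each heterozygote one.
        return 2 * g["GG"] + g["GA"], 2 * g["AA"] + g["GA"]

    case_g, case_a = alleles(case)
    ctrl_g, ctrl_a = alleles(control)
    tables = {
        "homozygous (GG vs AA)": (case["GG"], case["AA"],
                                  control["GG"], control["AA"]),
        "heterozygous (GG vs GA)": (case["GG"], case["GA"],
                                    control["GG"], control["GA"]),
        "dominant (GG vs GA + AA)": (case["GG"], case["GA"] + case["AA"],
                                     control["GG"], control["GA"] + control["AA"]),
        "recessive (AA vs GG + GA)": (case["AA"], case["GG"] + case["GA"],
                                      control["AA"], control["GG"] + control["GA"]),
        "allelic (G vs A)": (case_g, case_a, ctrl_g, ctrl_a),
    }
    return {name: odds_ratio(*t) for name, t in tables.items()}


# Illustrative counts only, not data from the included studies.
example = gene_model_ors({"GG": 40, "GA": 40, "AA": 20},
                         {"GG": 50, "GA": 40, "AA": 10})
for model, (or_value, low, high) in example.items():
    print(f"{model}: OR {or_value:.2f} (95% CI {low:.2f}, {high:.2f})")
```

The same tables apply to Arg194Trp with C and T in place of G and A.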
STATA14 was used for data analysis. The strength of association was measured by the pooled OR, interval estimates were expressed as 95% confidence intervals (CI), and P < .05 was considered statistically significant. Heterogeneity was assessed with the Q test and quantified with I². When I² > 50% or P < .10, heterogeneity was considered present and the random-effects model was used; otherwise, the fixed-effects model was used. [19] Subgroup analysis was performed according to different conditions. Where necessary, sensitivity analysis was performed and funnel plots were used to detect publication bias.
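The model-selection rule can be sketched in Python as follows. The code pools log ORs by inverse-variance weighting, computes Cochran's Q and I², and switches to a random-effects estimate when I² exceeds 50%. Two simplifications are assumptions of this sketch: the Q-test P value is not computed, so only the I² threshold drives the choice, and the random-effects estimate uses the DerSimonian-Laird method, which the source does not name. Function and variable names are hypothetical.

```python
import math


def pool_log_or(log_ors, ses, i2_cut=50.0):
    """Pool per-study log ORs; pick fixed or random effects by I^2."""
    k = len(log_ors)
    w = [1.0 / s ** 2 for s in ses]              # inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # DerSimonian-Laird estimate of the between-study variance.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    if i2 > i2_cut:
        model = "random"
        w_star = [1.0 / (s ** 2 + tau2) for s in ses]
        est = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum(w_star)
        se = math.sqrt(1.0 / sum(w_star))
    else:
        model = "fixed"
        est, se = fixed, math.sqrt(1.0 / sum(w))
    return {"model": model, "q": q, "i2": i2, "tau2": tau2,
            "log_or": est, "se": se,
            "or": math.exp(est),
            "ci": (math.exp(est - 1.96 * se), math.exp(est + 1.96 * se))}


# Illustrative inputs only: two strongly discordant studies.
print(pool_log_or([-0.5, 0.5], [0.1, 0.1]))
```

With heterogeneous inputs the random-effects interval is wider than the fixed-effects one, which is why the choice of model matters for the significance statements reported later.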
Because differential expression analysis of transcriptome sequencing data involves a large number of independent statistical hypothesis tests on gene expression values, false positives are a problem. In differential expression analysis, the widely accepted Benjamini-Hochberg method is therefore used to correct the original significance P values, and the false discovery rate (FDR) is used as the key indicator for screening differentially expressed genes. FDR < 0.01 or 0.05 is generally taken as the default standard.
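The Benjamini-Hochberg step-up procedure can be written compactly in Python. The sketch below converts a list of raw P values into adjusted values that can be compared directly with an FDR threshold such as 0.05. The function name and the example P values are hypothetical and are not the values reported in this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative P values only.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

A result is retained at FDR < 0.05 when its adjusted value is below 0.05.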

Data collection and analysis
3.1.1. Literature search results. A total of 2333 references were retrieved from the databases, and 1576 remained after titles were read and duplicates removed. Reading the abstracts excluded 1156 irrelevant articles that did not meet the inclusion criteria, leaving 420. After reading the full text, 55 incomplete articles were excluded, and 33 articles were finally included. The retrieval process is shown in Figure 1.
3.1.2. The basic characteristics and quality evaluation of the included literature. The 33 included articles were all case-control studies, with a total of 6233 subjects in the case group and 8555 in the control group. All samples were venous blood and were genotyped by different methods. Khokhrin [31] gave only the genotype distribution frequency, and Khrunin A V [54] published gene sequencing results and typing but did not control for the influence of other confounding factors. The genotype data of Ma Ning [25] were incomplete, and that article was excluded. Baseline characteristics and quality scores of the references are shown in Table 1.
3.1.3. Genotype distribution of the included literature. In this study, 3 SNP loci of the XRCC1 gene (Arg399Gln, Arg194Trp and Arg280His) were genotyped, and the linkage disequilibrium between loci was analyzed. The genotype distribution frequencies of the case and control groups, and whether each genotype distribution conforms to Hardy-Weinberg equilibrium (P > .05), are shown in Table 2. Results: Most of the genotypes of the 3 SNP sites were distributed in Hardy-Weinberg equilibrium. Linkage disequilibrium analysis was carried out pairwise among 3 SNP sites in 5 articles and between 2 SNP sites in 10 articles. Three SNPs can form 6 haplotypes: GCG, GCA, GTA, ACG, ACA, ATG, and ATA. Analysis with Haploview software gave a linkage disequilibrium coefficient D > 0, indicating strong linkage disequilibrium among the 3 SNP sites of the XRCC1 gene. Conclusion: The 3 SNP sites Arg399Gln, Arg194Trp and Arg280His of the XRCC1 gene showed complete linkage disequilibrium.
Zhang and Li Medicine (2021) 100:50 www.md-journal.com
3.2. Statistical analysis
3.2.1. Meta-analysis data. Five gene models were used for analysis. Heterogeneity was quantified with I²; when I² > 50% or P < .10, heterogeneity was considered present and a random-effects model was used. The relationship between the different genotypes of the 3 loci and cancer susceptibility was heterogeneous, so the random-effects model was used, as shown in Tables 3-5. P < .05 indicated that the difference in genotype distribution frequency between the case and control groups was statistically significant and that the polymorphism at the site was correlated with cancer susceptibility.

Subgroup analysis data.
In the subgroup analysis, Arg399Gln was significantly correlated with gynecologic cancer susceptibility in Asian populations (P < .05). Arg194Trp was also significantly associated with gynecologic cancer susceptibility in Asian populations (P < .05). All ethnic groups of Arg280His In cervical, ovarian and endometrial cancers, Arg399Gln was significantly associated with cervical cancer susceptibility (P < .05 in each gene model), and Arg194Trp was significantly associated with endometrial cancer susceptibility (CC vs TT, CC vs CT, P < .05). All significant P values (P < .05) were sorted and Benjamini-Hochberg corrected; with FDR < 0.05, the probability of a false positive was low and the P values remained statistically significant. The forest plots of the meta-analysis and subgroup analysis are shown in Figure 2.

Sensitivity analysis
Sensitivity analysis was performed on the relationship between XRCC1 Arg399Gln GG vs AA and susceptibility to gynecologic cancer. When each article was removed in turn, no significant changes were found in the effect size across the 31 articles, and the results remained within the 95% CI.
In the sensitivity analysis of XRCC1 Arg194Trp CC vs TT and susceptibility to gynecologic cancer, when each article was removed in turn, no significant changes were found in the effect size across the 16 articles, and the results remained within the 95% CI.
In the sensitivity analysis of the relationship between XRCC1 Arg194Trp CC vs TT and susceptibility to gynecologic cancer, when each article was removed in turn, no significant changes in the effect size were found across the 6 articles, and the results remained within the 95% CI, as shown in Figure 3.
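The leave-one-out procedure described above can be sketched in Python. Each study is removed in turn and the remaining studies are pooled again, so that the stability of the pooled OR can be inspected. For brevity this sketch pools with a fixed-effects inverse-variance estimate, which is an assumption and differs from the random-effects model used in the analysis; function names and inputs are hypothetical.

```python
import math


def pooled_or(log_ors, ses):
    """Fixed-effects inverse-variance pooled OR with 95% CI."""
    w = [1.0 / s ** 2 for s in ses]
    est = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return (math.exp(est),
            math.exp(est - 1.96 * se),
            math.exp(est + 1.96 * se))


def leave_one_out(log_ors, ses):
    """Re-pool after omitting each study in turn.

    Returns one (OR, lower, upper) tuple per omitted study.
    """
    results = []
    for i in range(len(log_ors)):
        kept_y = log_ors[:i] + log_ors[i + 1:]
        kept_se = ses[:i] + ses[i + 1:]
        results.append(pooled_or(kept_y, kept_se))
    return results


# Illustrative inputs only: the third study is an outlier.
ys = [0.0, 0.0, math.log(2.0)]
for omitted, (or_value, low, high) in enumerate(leave_one_out(ys, [0.1] * 3)):
    print(f"omit study {omitted}: OR {or_value:.2f} ({low:.2f}, {high:.2f})")
```

If omitting any single study moves the pooled OR outside the overall 95% CI, that study is a candidate source of instability.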

Publication bias
The funnel plot for XRCC1 Arg399Gln GG vs AA and susceptibility to gynecologic cancer showed slight publication bias; the points formed a relatively uniform funnel shape, indicating that the bias was small, as shown in Figure 4. Begg's test: N = 31, z = 0.99, Pr > |z| = 0.32; Egger's test: P > |t| = 0.187. P > .05 indicates no significant publication bias.
The funnel plot for XRCC1 Arg194Trp CC vs TT and susceptibility to gynecologic cancer suggested publication bias, as shown in Figure 4. Begg's test: N = 14, z = 1.31, Pr > |z| = 0.189; Egger's test: P > |t| = 0.056. P > .05 indicates no significant publication bias.
The funnel plot for XRCC1 Arg280His GG vs GA and susceptibility to gynecologic cancer showed significant publication bias, as shown in Figure 4. Begg's test: N = 6, z = 2.63, Pr > |z| = 0.009; Egger's test: P > |t| = 0.017. P < .05 indicates publication bias. All data are shown in Table 6.
Table 5 Summary ORs of the XRCC1 Arg280His polymorphism and gynecologic cancer risk.

Ethics and dissemination
The literature collected for this study comes from published academic articles in professional online databases, and the data used in the statistical analysis can be obtained from these publicly available papers, so the study does not require ethical approval.

Discussion
The occurrence and development of cancer result from a combination of multiple factors, including genetic inheritance, hormone levels, inflammatory factors and dietary habits. Over the past decade, advances have been made in understanding the pathogenesis of gynecologic tumors and in anticancer therapies. However, the 5-year survival rate is still very low, so it is important to find new molecular markers that can be used to predict the risk of cancer. The XRCC1 gene encodes a protein of the same name that is an important functional protein in the repair of single-strand DNA damage. Mutation of the XRCC1 allele is associated with reduced DNA repair ability and a prolonged cell cycle. XRCC1 Arg399Gln non-synonymous mutations cause amino acid sequence changes that affect protein function and DNA repair ability, and may affect the interaction with other DNA repair proteins, leading to an increased risk of tumor. [55] Wu [56] measured XRCC1 mRNA expression in the peripheral blood of patients with ovarian cancer in a platinum-sensitive group (19 cases), a partially platinum-sensitive group (25 cases) and a platinum-resistant group (22 cases). The expression level of XRCC1 protein in the platinum-resistant group was higher than that in the platinum-sensitive group (P < .05). This indicated that XRCC1 gene expression in peripheral blood may affect the sensitivity of ovarian cancer to cisplatin chemotherapy. Current polymorphism studies of XRCC1 focus mainly on 3 non-synonymous SNPs: Arg194Trp, Arg399Gln and Arg280His. An association between XRCC1 polymorphisms and susceptibility to gynecological malignant tumors has been reported many times, but negative reports are also common. [57][58][59][60] Therefore, there is still no consensus on the relationship between XRCC1 polymorphisms and gynecological malignant tumors.
Differences in research design, test methods, sample size and population distribution inevitably affect experimental conclusions. As a powerful tool, meta-analysis can help overcome these factors, integrate data and conclusions, and provide a basis for subsequent research.
Heterogeneity is an important part of meta-analysis, and understanding its source is very important, [61] because it can often suggest ways to solve problems. In our meta-analysis, heterogeneity derived mainly from the studies of Khokhrin, Sonali Verma and Jakubowska. [31,35,50] All 3 studies examined ovarian cancer in European, non-Asian populations, which points to population as a major factor.
In the TCGA dataset, XRCC1 has the highest mutation rate in uterine corpus endometrial carcinoma samples, a relatively high mutation rate in cervical squamous cell carcinoma and endocervical adenocarcinoma, and a low mutation rate in ovarian cancer, indicating that mutation of this gene may lead to endometrial cancer and cervical cancer. A limitation of this study is that only 6 references on endometrial cancer were included, of which only 4 conformed to Hardy-Weinberg equilibrium, and the sample populations were all Asian. To address these problems, follow-up studies need larger samples and multi-ethnic populations.
The XRCC1 Arg399Gln, Arg194Trp and Arg280His single nucleotide polymorphisms were associated with gynecologic cancer susceptibility. Arg399Gln was significantly associated with cervical cancer susceptibility, and Arg194Trp was significantly associated with endometrial cancer susceptibility. XRCC1 genotype detection at each site is expected to serve as a molecular marker for gynecologic cancer screening.